Utilizing interband acoustical information for modeling stationary time-frequency regions of noisy speech

Author

  • Chang Dong Yoo
Abstract

A novel enhancement system is developed that exploits the properties of stationary regions localized in both time and frequency. This system selects stationary time-frequency (TF) regions and adaptively enhances each region according to its local signal-to-noise ratio (LSNR) while utilizing both the acoustical knowledge of speech and the masking properties of the human auditory system. Each region is enhanced for maximum noise reduction while minimizing distortion. This paper evaluates the proposed system through informal listening tests and some objective measures.

Similar resources

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems, both for feature extraction and for acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

A glimpsing model of speech perception in noise.

Do listeners process noisy speech by taking advantage of "glimpses" (spectrotemporal regions in which the target signal is least affected by the background)? This study used an automatic speech recognition system, adapted for use with partially specified inputs, to identify consonants in noise. Twelve masking conditions were chosen to create a range of glimpse sizes. Several different glimpsing m...

Stochastic perceptual models of speech

We have recently developed a statistical model of speech that avoids a number of current constraining assumptions for statistical speech recognition systems, particularly the model of speech as a sequence of stationary segments consisting of uncorrelated acoustic vectors. We further wish to focus statistical modeling power on perceptually-dominant and information-rich portions of the speech sig...

An evaluation of objective measures for intelligibility prediction of time-frequency weighted noisy speech.

Existing objective speech-intelligibility measures are suitable for several types of degradation; however, it turns out that they are less appropriate in cases where noisy speech is processed by a time-frequency weighting. To this end, an extensive evaluation is presented of objective measures for intelligibility prediction of noisy speech processed with a technique called ideal time frequency (...

Temporal resolution analysis in frequency domain linear prediction.

Frequency domain linear prediction (FDLP) is a technique for auto-regressive modeling of Hilbert envelopes. In this letter, the resolution properties of the FDLP model are investigated using synthetic signals with impulses immersed in noise. The effects of various factors that affect the temporal resolution are studied, and this analysis suggests ways to improve the resolution of the FDLP envelo...

Publication date: 1999